1 PageCluster: Mining conceptual link hierarchies from Web log files for adaptive Web
Full text available: pdf(280.84 KB)



User traversals on hyperlinks between Web pages can reveal semantic relationships 
between these pages. We use user traversals on hyperlinks as weights to measure 
semantic relationships between Web pages. On the basis of these weights, we propose a 
novel method to put Web pages on a Web site onto different conceptual levels in a link 
hierarchy. We develop a clustering algorithm called PageCluster, which clusters 
conceptually-related pages on each conceptual level of the link hierarchy based on th ... 

Keywords: Link hierarchies, Web site navigation, bibliographic analysis, clustering, 
conceptual link hierarchies, link similarity 



2 Visualizing digital library search results with categorical and hierarchical axes 

Ben Shneiderman, David Feldman, Anne Rose, Xavier Ferre Grau 
June 2000 Proceedings of the fifth ACM conference on Digital libraries DL '00
Publisher: ACM Press 

Full text available: pdf(682.87 KB) Additional Information: full citation, abstract, references, citings, index terms



Digital library search results are usually shown as a textual list, with 10-20 items per 
page. Viewing several thousand search results at once on a two-dimensional display with 
continuous variables is a promising alternative. Since these displays can overwhelm some 
users, we created a simplified two-dimensional display that uses categorical and 
hierarchical axes, called hieraxes. Users appreciate the meaningful and limited number of 
terms on each hieraxis. At each grid point ... 

Keywords: categorical axes, digital libraries, graphical user interfaces, hierarchy, 
hieraxes, information visualization 
Full text available: pdf(502.59 KB) Additional Information: full citation, abstract, references, citings, index terms

The amounts of information residing on web sites make users' navigation a hard task. To
address this problem, web sites provide recommendations to the end users, based on
similar users' navigational patterns mined from past visits. In this paper we introduce a
recommendation method, which integrates usage data recorded in web logs, and the
conceptual relationships between web documents. In the proposed framework, the
usage-oriented URI representation of web pages and users' behavior is augmen ...

Keywords: concept hierarchies, semantic web mining, semantic web personalization, 
web content semantics 
May 2002 Proceedings of the 11th international conference on World Wide Web WWW '02
Publisher: ACM Press 

Full text available: pdf(136.12 KB) Additional Information: full citation, abstract, references, citings, index terms

The structure of the web is increasingly being used to improve organization, search, and 
analysis of information on the web. For example, Google uses the text in citing documents 
(documents that link to the target document) for search. We analyze the relative utility of 
document text, and the text in citing documents near the citation, for classification and 
description. Results show that the text in citing documents, when available, often has 
greater discriminative and descriptive power than th ... 

Keywords: SVM, anchortext, classification, cluster naming, entropy based feature 
extraction, evaluation, web directory, web structure 
In this paper, we first describe a formative empirical study to inform the design of CSCW
tools to support idea finding in co-located groups. Groups of students worked on creative 
problems with mapping and whiteboard tools in different work modes. Concluding from 
the results of the study, requirements are derived. A suite of tools that are informed by 
these requirements is presented along with typical scenarios of their usage. The suite 
consists of three software components covering a Mind-Mappi ... 

Keywords: co-located groups, computer-supported cooperative work, creativity,
formative evaluation, human-computer interaction, idea finding, large displays,
mind-mapping, personal digital assistants, production blocking, tool-suite
Knowledge discovery and data mining KDD '06
Full text available: pdf(910.76 KB) Additional Information: full citation, abstract, references, index terms

Hierarchical models have been shown to be effective in content classification. However,
we observe through empirical study that the performance of a hierarchical model varies
with given taxonomies; even a semantically sound taxonomy has potential to change its
structure for better classification. By scrutinizing typical cases, we elucidate why a given
semantics-based hierarchy does not work well in content classification, and how it could
be improved for accurate hierarchical classification. Wit ...

Keywords: hierarchical classification, hierarchical modeling, taxonomy adjustment, text 
classification 



7 Cat-a-Cone: an interactive interface for specifying searches and viewing retrieval 
October 1997 Proceedings of the 15th annual international conference on Computer
documentation SIGDOC '97
This poster presents a general clustering-based algorithm for deriving presentation
structure from semantic structure. Domain-independent presentation generation results 
from this algorithm. 

Keywords: clustering, document structure, hypermedia, presentation generation, 
semantics, style 
World Wide Web WWW '05
Publisher: ACM Press

Full text available: pdf(514.22 KB) Additional Information: full citation, abstract, references, citings, index terms
In this paper we propose a hierarchical clustering engine, called SnakeT, that is able to
organize on-the-fly the search results drawn from 16 commodity search engines into a
hierarchy of labeled folders. The hierarchy offers a complementary view to the flat-ranked
list of results returned by current search engines. Users can navigate through the
hierarchy driven by their search needs. This is especially useful for informative,
polysemous and poor queries. SnakeT is the first complete an ...

Keywords: information extraction, new search applications and interfaces, personalized 
web ranking, search engines, web snippets clustering 
Publisher: ACM Press

Full text available: pdf(634.38 KB) Additional Information: full citation, abstract, references, index terms

As information volume in enterprise systems and in the Web grows rapidly, how to
accurately retrieve information is an important research area. Several corpus-based
smoothing techniques have been proposed to address the data sparsity and synonym
problems faced by information retrieval systems. Such smoothing techniques are often
unable to discover and utilize the correlations among terms. We propose CVS, a
Correlation-Verification based Smoothing method, that considers co-occurrence
information i ...

Keywords: information retrieval, query expansion, smoothing, term clustering, text 
mining 
We discuss our experiences in analyzing customer-support issues from the unstructured
free-text fields of technical-support call logs. The identification of frequent issues and
their accurate quantification is essential in order to track aggregate costs broken down by
issue type, to appropriately target engineering resources, and to provide the best
diagnosis, support and documentation for most common issues. We present a new set of
techniques for doing this efficiently on an industrial scale, w ...

Keywords: applications, log processing, quantification, supervised machine learning, text 
classification, text mining 
Publisher: ACM Press

Full text available: pdf(235.40 KB) Additional Information: full citation, abstract, references, index terms

Incremental hierarchical text document clustering algorithms are important in organizing
documents generated from streaming on-line sources, such as Newswire and Blogs.
However, this is a relatively unexplored area in the text document clustering literature.
Popular incremental hierarchical clustering algorithms, namely Cobweb and Classit, have
not been widely used with text document data. We discuss why. In the current form,
these algorithms are not suitable for text clusteri ...

Keywords: hierarchical clustering, incremental clustering, text clustering 
November 1997 Proceedings of the 1997 conference of the Centre for Advanced Studies on Collaborative research CASCON '97
Publisher: IBM Press

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references, index terms

Understanding distributed applications is a tedious and difficult task. Visualizations based 
on process-time diagrams are often used to obtain a better understanding of the 
execution of the application. The visualization tool we use is Poet, an event tracer 
developed at the University of Waterloo. However, these diagrams are often very complex 
and do not provide the user with the desired overview of the application. In our 
experience, such tools display repeated occurrences of non-trivial commun ... 
March 2002 Proceedings of the 2002 ACM symposium on Applied computing SAC '02
Mobile clients have limited display and navigation capabilities. To browse a set of
documents, an intuitive method is to navigate through concept hierarchies. To reduce
semantic loading for each term that represents the concepts and the cognitive loading of
users due to the limited display, similar documents are grouped together before concept
hierarchies are constructed for each document group. Since the concept hierarchies only
represent the salient concepts in the documents, term extraction i ...

Keywords: browsing, concept hierarchy, information access, mobile agent, mobile 
computing, navigation, summarization 

From volume 1 Preface (See Front Matter for full Preface) 

This book is intended for a one or two semester course in compiling theory at the senior 
or graduate level. It is a theoretically oriented treatment of a practical subject. Our 
motivation for making it so is threefold. 

(1) In an area as rapidly changing as Computer Science, sound pedagogy demands that 
courses emphasize ideas, rather than implementation details. It is our hope that the 
algorithms and concepts presen ... 

World Wide Web: Using Markov models for web site link prediction 
June 2002 Proceedings of the thirteenth ACM conference on Hypertext and
Markov models have been extensively used to model Web users' navigation behaviors on
Web sites. The link structure of a Web site can be seen as a citation network. By applying
bibliographic co-citation and coupling analysis to a Markov model constructed from a Web
log file on a Web site, we propose a clustering algorithm called CitationCluster to cluster
conceptually related pages. The clustering results are used to construct a conceptual
hierarchy of the Web site. Markov model based link predic ...

Keywords: Markov models, hierarchy, link prediction 
In this paper I outline Type-inheritance Combinatory Categorial Grammar (TCCG), an
implemented feature structure based CCG fragment of English. TCCG combines the fully
lexical nature of CCG with the type-inheritance hierarchies and complex feature structures
of Head-driven Phrase Structure Grammars (HPSG). The result is a CCG/HPSG hybrid that
combines linguistic generalizations previously only statable in one theory or the other,
even extending the set of statable generalizations to those not eas ...